Durational Cues to Word Recognition in Spoken French
Abstract
In spoken French, the phonological processes of liaison and resyllabification can render word and syllable boundaries ambiguous (e.g., un air ‘an air’ / un nerf ‘a nerve’, both [ɛ̃.nɛʁ]). Production data have demonstrated that speakers of French vary the duration of consonants that surface in liaison environments relative to consonants produced word-initially (Wauquier-Gravelines, 1996; Spinelli et al., 2003). Further research has suggested that listeners exploit these durational differences in the processing of running speech (Gaskell et al., 2002; Spinelli et al., 2003), though no study to date has directly tested this hypothesis. The current study examines the exploitation of duration in word recognition processes by manipulating this single acoustic factor while holding all other factors in the signal constant. The pivotal consonants in potentially ambiguous French sequences (e.g., /n/ in un nerf) were instrumentally shortened and lengthened and presented to listeners in two behavioral tasks. Results suggest that listeners are sensitive to segmental duration and use this information to modulate the lexical interpretation of spoken French.
halshs-00683607, version 1 - 29 Mar 2012
The syllable has been shown to play a prominent role in the processing of continuous speech in French (Mehler, Dommergues, Frauenfelder, & Segui, 1981; Cutler, Mehler, Norris, & Segui, 1989). However, the frequent misalignment of syllable and word boundaries due to the phonological processes of enchaînement (resyllabification) and liaison (linking) is problematic for syllable-based lexical access models, which assume that the edges of words and syllables tend to coincide.
Enchaînement occurs when a consonant-final word (W1) is followed by a vowel-initial (V-initial) word (W2). The coda of W1 is resyllabified across the word boundary to become the onset of W2.
The phrase une amie ‘a friend’ (feminine) is thus produced as [y.na.mi], where syllable and word boundaries are mismatched, instead of [yn.a.mi], where boundaries would be aligned. Liaison, on the other hand, concerns consonants in final position that are represented graphically but are latent when the word is produced in isolation or before a consonant-initial (C-initial) word. The latent consonant is realized, however, before a V-initial word and is then resyllabified through enchaînement. For example, the singular masculine indefinite article un is pronounced [ɛ̃] in isolation or before a consonant (e.g., un stylo [ɛ̃.sti.lo] ‘a pen’); however, when it precedes a vowel onset in W2, as in un ami ‘a friend’ (masculine), the latent /n/ surfaces and is syllabified as the onset of ami. Accordingly, the sequence is syllabified [ɛ̃.na.mi] instead of [ɛ̃n.a.mi], where word boundaries would be respected. As a result, boundaries between syllables and words are often blurred (e.g., un air ‘an air’ and un nerf ‘a nerve’, both [ɛ̃.nɛʁ]).
Footnote 1: Except in cases of epenthetic liaison, as in quatre [z] enfants ‘four children’, in which a liaison consonant is introduced in production to resolve hiatus but is not represented in the orthography of W1.
Boundary misalignment has been shown to incur processing costs in French. Using a word-spotting task, Dumay, Frauenfelder, and Content (2002) showed that reaction times were significantly faster in identifying the word lac ‘lake’ embedded in the non-word zun.lac, where the illicit onset /nl/ forces a syllable boundary between /n/ and /l/, than in zu.glac, where /ɡl/ is an allowed onset and therefore word and syllable boundaries are not necessarily aligned. These authors propose that it is the mismatch of word and syllable boundaries in the latter case that slows reaction times.
This work is consistent with research in numerous languages suggesting that listeners are sensitive to the fact that syllable boundaries generally coincide with word boundaries and exploit this fact in spoken word recognition (Content, Kearns, & Frauenfelder, 2001; Content, Meunier, Kearns, & Frauenfelder, 2001; Cutler & Norris, 1988; Norris, McQueen, Cutler, & Butterfield, 1997; Vroomen & de Gelder, 1997). Given the prominent role of the syllable in French, the prevalence of liaison and resyllabification would presumably hinder speech processing and impede access to mental representations.
Competition-based models of spoken word recognition (e.g., McClelland & Elman, 1986; Norris, 1994) propose that a set of candidate words consistent with acoustic (bottom-up) cues is simultaneously activated in a listener’s mental lexicon as the input is processed in real time. The competition process concludes when an optimal parse is achieved and the acoustic signal is segmented into non-overlapping words. In the case of liaison, however, multiple lexical interpretations can account for the same phonetic sequence (e.g., [ɛ̃.nɛʁ]), and therefore competition processes are assumed to be hindered.
A study by Gaskell, Spinelli, and Meunier (2002), however, suggested that resyllabification in French can in fact facilitate the recognition of V-initial words. Their cross-modal priming study showed significant priming effects as compared to a control for V-initial words in three conditions: liaison (e.g., un généreux italien [ɛ̃.ʒe.ne.ʁø.zi.ta.ljɛ̃] ‘a generous Italian’), enchaînement (e.g., un virtuose italien [ɛ̃.viʁ.tɥo.zi.ta.ljɛ̃] ‘an Italian virtuoso’), and a syllable-aligned condition (e.g., un chapeau italien [ɛ̃.ʃa.po.i.ta.ljɛ̃] ‘an Italian hat’).
Reaction times were measured, and participants recognized V-initial targets preceded by resyllabified consonants, in both liaison (e.g., un généreux italien) and enchaînement (e.g., un virtuose italien) conditions, significantly faster than targets that matched syllable boundaries (e.g., un chapeau italien). The authors suggested that resyllabification is somehow acoustically marked and that these acoustic cues facilitate the lexical competition process.
Footnote 2: As pointed out by a reviewer of this article, one could posit an alternate explanation for the facilitation effects in that the third condition (e.g., un chapeau italien) represents a situation of hiatus, which could be more difficult to process than a consonantal onset.
This study extended previous findings by Wauquier-Gravelines (1996), who showed in a word-monitoring task that éléphant ‘elephant’ is recognized as readily in un petit éléphant [ɛ̃.pə.ti.te.le.fɑ̃] ‘a small elephant’, a liaison environment, as in un joli éléphant [ɛ̃.ʒo.li.e.le.fɑ̃], where no liaison is possible and boundaries are aligned.
Spinelli, McQueen, and Cutler (2003) also found that perceptual efficacy and processing are not hindered by liaison and resyllabification. These authors probed lexical access processes and revealed significant priming effects for both C-initial and V-initial words in globally ambiguous sentence pairs such as c’est le dernier rognon, ‘it’s the last kidney’, and c’est le dernier oignon, ‘it’s the last onion’, both [se.lə.dɛʁ.nje.ʁɔ.ɲɔ̃]. Spinelli et al. employed four priming conditions in a lexical-decision task: an ambiguous liaison condition (c’est le dernier oignon), an ambiguous non-liaison condition (c’est le dernier rognon), an unambiguous condition where liaison would not be possible (c’est un demi rognon, ‘it’s a half kidney’), and
finally an unambiguous baseline condition including an unrelated word where liaison would not be possible (c’est un ancien nitrate, ‘it’s an old nitrate’). Priming effects were observed for both V-initial (oignon) and C-initial (rognon) candidates in the ambiguous conditions. In other words, the ambiguity caused by liaison and subsequent resyllabification did not impair the lexical activation of the V-initial candidate. Furthermore, priming effects followed the intention of the speaker, i.e., priming effects were stronger for oignon than for rognon when the speaker intended oignon, and vice versa. Their results also suggested that words not intended by the speaker in ambiguous contexts (e.g., oignon when dernier rognon is produced) were activated, but not as strongly as in the intended production. Significantly, they did not find priming effects for oignon in the unambiguous condition where liaison is not possible (e.g., demi rognon), suggesting that only the liaison environment allows for the activation of both consonant- and V-initial lexical candidates.
Spinelli et al. (2003) also invoke allophonic variation to account for the observed priming effects. They hypothesize that listeners exploit “subtle but reliable” (p. 248) variations in segmental duration to locate word boundaries and that access to mental representations is facilitated by these cues, concluding that durational differences are robust enough to “bias interpretation in the correct direction” (p. 250).
In line with this hypothesis are French production data that have revealed significant differences in duration between liaison consonants (LC; e.g., /n/ in un air) and initial consonants (IC; e.g., /n/ in un nerf). These same authors found significant durational differences for five consonants that surface in liaison: /n, t, ʁ, ɡ, p/.
Liaison consonants were on average 17% shorter than initial consonants. Measurements of the pivotal consonants revealed that ICs were on average 10 ms longer (difference range = 6 to 12 ms) than word-final, resyllabified consonants (however, these authors did not report specific differences between LCs and ICs for each of the five segments tested). Wauquier-Gravelines (1996) found similar results for /t/, which had an average closure duration of 50 ms in liaison position and 70 ms in initial position, though she did not find significant durational differences between liaison and word-initial /n/ (58 ms versus 61 ms, respectively). Gaskell et al. (2002) also found durational differences; the segments /t/, /ʁ/, and /z/ were significantly shorter when realized in liaison (mean 73 ms) than in word-initial position (mean 88 ms). Even more recently, Douchez and Lancia (2008) measured linguopalatal contact and found evidence of more sustained contact between the tongue and the palate in the articulation of ICs than of LCs.
This line of research in French is supported by data showing that speakers in general tend to strengthen the articulation of segments at the edges of prosodic domains (Cho & Keating, 2001; Cho, McQueen, & Cox, 2007). Specifically, initial segments have been shown to be systematically longer than the same segment in medial or final position in English (Lehiste, 1961, 1972; Klatt, 1976; Gow & Gordon, 1995; Fougeron & Keating, 1997), French (Fougeron, 2001), Dutch (Shatzman & McQueen, 2006), and Italian (Tabossi, Collina, Mazzetti, & Zoppello, 2000).
However, though systematic durational differences have been found, they may not be robust enough in natural speech to allow listeners to disambiguate spoken French. Using the same recordings of globally ambiguous phrases used in the Spinelli et al.
(2003) study, Shoemaker and Birdsong (2008) more directly tested the perception of liaison by employing a forced-choice identification task in which 15 native speakers of French and 15 late learners of L2 French were asked to differentiate ambiguous phonemic content, e.g., [il.na.o.kɛ̃.nɛʁ] produced as either Il n’a aucun air or Il n’a aucun nerf. Participants in both groups performed at chance (native speaker mean accuracy, 53.2%; non-native speaker mean accuracy, 52.7%), suggesting that, though durational differences may allow for the activation of V-initial candidates in the word recognition process, these differences are not sufficiently robust to systematically guide listeners in disambiguation.
More recently, Tremblay (2011) used eye-tracking to investigate the parsing of locally ambiguous real- and non-word sequences containing the pivotal consonant /z/ by both native and non-native speakers of French. Results from this study suggest that distributional cues (i.e., the frequency with which /z/ appears in liaison position relative to lexical word-initial position) influenced processing more than durational differences between /z/ produced in liaison (e.g., fameux élan ‘infamous swing’) and /z/ produced as a lexical onset (e.g., fameux zélé ‘infamous zealous one’). Put differently, acoustic cues were not sufficient to guide listeners’ eye fixations when the speech signal required temporary disambiguation.
The exploitation of durational variation in word recognition has been demonstrated in numerous languages. Davis, Marslen-Wilson, and Gaskell (2002) were among the first to demonstrate that lexical access processes in English are sensitive to subtle durational differences linked to prosodic boundaries, specifically between a monosyllabic word and the same phonemic sequence embedded in a longer word (e.g., cap and captain).
This study employed a cross-modal priming task in which participants heard a sequence such as /kæp/ produced either as monosyllabic cap or as the first syllable of captain. The results showed differential activation for the two productions, with more activation of the shorter word when participants were presented with monosyllabic productions, and conversely more activation of the longer word when participants were presented with a portion of the disyllabic sequence.
More recently, Shatzman and McQueen (2006) found that the online interpretation of ambiguous sequences in Dutch can be influenced by segment duration alone. This study investigated the recognition of sequences rendered ambiguous by the lexical assignment of /s/ (e.g., eens pot ‘once jar’ and een spot ‘a spotlight’). In a production sample, they found variation in several acoustic factors in the ambiguous pairs, including segment duration, closure duration of the stop consonant, the duration of the entire word, root mean square energy of /s/, and root mean square energy of the stop consonant following /s/. However, the duration of the segment /s/ alone was significantly predictive of participants’ performance on an identification task that also tracked eye movements. In a second experiment in this study, the researchers instrumentally manipulated the duration of /s/ by both shortening and lengthening the segment. Even stronger correlations between responses and the duration of /s/ were found with these manipulated stimuli, confirming that duration was a sufficient and reliable cue to segmentation.
In the case of spoken French, durational differences between consonants that surface in liaison environments and the same consonant in initial position have been demonstrated in production; however, no research to date has confirmed that these durational differences influence lexical interpretation. Thus Spinelli et al.’s (2003) suggestion that duration influences lexical access in French remains conjectural, as their study did not demonstrate directly that duration was guiding participants’ responses. Moreover, Shoemaker and Birdsong (2008) did not find significant correlations between segment duration and response patterns using the same stimuli in an identification task.
The current study addresses this research gap by using French stimuli in which this single acoustic factor is manipulated in the same physical utterance while holding all other acoustic factors constant. In this way it is possible to directly test listeners’ exploitation of durational variation in cases of potential lexical ambiguity. To this end, the current study includes both an AX discrimination task and a forced-choice identification task employing stimuli in which the pivotal consonants in ambiguous phrases (e.g., /n/ in [ɛ̃.nɛʁ], un air or un nerf) are instrumentally shortened and lengthened while the rest of the utterance remains unaltered.
An AX discrimination task is used to tap lower-level acoustic processing and is motivated by the assumption that segmental duration represents an effective cue to segmentation and lexical access only to the extent that this cue is perceptually salient to listeners. A forced-choice identification task is used to investigate the exploitation of segmental duration in higher-level lexical decision processes by determining the extent to which segmental duration modulates lexical interpretation.
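The logic of manipulating one factor while holding the others constant can be illustrated with a minimal sketch. It works on a hypothetical segment-level annotation of [ɛ̃.nɛʁ] rather than on audio, so it says nothing about how the signal itself was edited; the segment durations, the scaling factors, and the names Segment and scale_pivotal are all assumptions made for illustration, not values or tools from the study.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    label: str
    duration_ms: float


def scale_pivotal(segments, index, factor):
    """Return a new list in which only segments[index] has its duration scaled."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [
        replace(seg, duration_ms=seg.duration_ms * factor) if i == index else seg
        for i, seg in enumerate(segments)
    ]


# Hypothetical durations for [ɛ̃.nɛʁ]; these are NOT measurements from the study.
base = [
    Segment("ɛ̃", 90.0),
    Segment("n", 60.0),
    Segment("ɛ", 110.0),
    Segment("ʁ", 80.0),
]
PIVOT = 1  # index of the pivotal consonant /n/

# Hypothetical shortening and lengthening steps around the unmodified token.
continuum = {f: scale_pivotal(base, PIVOT, f) for f in (0.5, 0.75, 1.0, 1.25, 1.5)}

for factor, version in continuum.items():
    print(factor, [(seg.label, seg.duration_ms) for seg in version])
```

Every version in the continuum differs from the base token only in the duration of /n/, which is the property that lets a difference in listeners' responses be attributed to that single cue.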
Specific questions asked about the processing of liaison in French are thus the following: (1) What are the thresholds of perceptual saliency with respect to durational differences between consonants that surface in liaison and word-initial consonants for speakers of French? (2) Are acoustic differences sufficient to override ambiguity in globally ambiguous liaison contexts? (3) To what extent does segmental duration modulate lexical access and the segmentation routines of speakers of French?
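Question (1) concerns discrimination sensitivity. One common way to quantify sensitivity in a same/different (AX) task is the signal-detection index d′. The text above does not state which measure was used, so the following is only an illustrative sketch: it applies the simple yes/no formula d′ = z(H) − z(F) with a log-linear correction, and the function name d_prime and the response counts are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total)
    keeps both rates strictly between 0 and 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one listener at one duration step:
# "different" responses on different trials (hits) and on same trials (false alarms).
sensitivity = d_prime(hits=18, misses=2, false_alarms=4, correct_rejections=16)
print(round(sensitivity, 2))
```

A value near zero means the listener cannot tell the shortened or lengthened token from the original, and larger values mean better discrimination. Same/different designs can also be analyzed with other decision models, which would yield different d′ values from the same counts.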
Related resources
Using durational cues in a computational model of spoken-word recognition
Evidence that listeners use durational cues to help resolve temporarily ambiguous speech input has accumulated over the past few years. In this paper, we investigate whether durational cues are also beneficial for word recognition in a computational model of spoken-word recognition. Two sets of simulations were carried out using the acoustic signal as input. The simulations showed that the comp...
Modeling the use of durational information in human spoken-word recognition.
Evidence that listeners, at least in a laboratory environment, use durational cues to help resolve temporarily ambiguous speech input has accumulated over the past decades. This paper introduces Fine-Tracker, a computational model of word recognition specifically designed for "tracking" fine-phonetic information in the acoustic speech signal and using it during word recognition. Two simulations...
Modeling the phonological recognition of Persian words
A model of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...
Phonotactic and acoustic cues for word segmentation in English
This study investigates the influence of both phonotactic and acoustic cues on the segmentation of spoken English. Listeners detected embedded English words in nonsense sequences (word spotting). Words aligned with phonotactic boundaries were easier to detect than words without such alignment. Acoustic cues to boundaries could also have signaled word boundaries, especially when word onsets lack...
Do different boundary types induce subtle acoustic cues to which French listeners are sensitive?
This paper examines the production and perception of three types of phonological boundaries. In the first part, we extended our previous acoustic analysis to confirm that French speakers mark word and syllable boundaries differently in enchaînement sequences. The durational properties of vowels and consonants were compared in 3 boundary conditions: (A) enchaînement (V1C#V2), (B) word-initial con...